On Confusions in a Phoneme Recognizer

نویسندگان

  • Andrew Lovitt
  • Joel Pinto
  • Hynek Hermansky
چکیده

In this paper, we analyze the confusions patterns at three places in the hybrid phoneme recognition system. The confusions are analyzed at the pronunciation, the posterior probability, and the phoneme recognizer levels. The confusions show significant structure that is similar at all levels. Some confusions also correlate with human psychoacoustic experiments in white masking noise. These structures imply that not all errors should be counted equally and that some phoneme distinctions are arbitrary. Understanding these confusion patterns can improve the performance of a recognizer by eliminating problematic phoneme distinctions. These principles are applied to a phoneme recognition system and the results show a marked improvement in the phone error rate. Confusion pattern analysis leads to a better way of choosing phoneme sets for recognition.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Study of Phonemes Confusions in Hierarchical Automatic Phoneme Recognition System

In this paper, we have analyzed the impact of confusions on the robustness of phoneme recognitions system. The confusions are detected at the pronunciation and the confusions matrices of the phoneme recognizer. The confusions show that some similarities between phonemes at the pronunciation affect significantly the recognition rates. This paper proposes to understand those confusions in order t...

متن کامل

Forward optimal modeling of acoustic confusions in Mandarin CALL system

Acoustic confusions degrade the accuracy of pronunciation assessment severely in Computer Assisted Language Learning (CALL) systems. This paper presents our recent study on optimal modeling of the acoustic confusions. We change the traditional mandarin syllable structure, which is composed of initial and final, to a novel phoneme structure. Several phoneme splitting strategies are investigated,...

متن کامل

Phonetic ignorance is bliss: Inves phonetic information reduction o

Perception studies have long argued that phonetic confusions are more likely to happen across some phonetic features than other (e.g., place of articulation rather than manner) [1]. Similarly, we and others have noted that pronunciation variation occurs more frequently in unstressed syllables, and in syllable codas. This suggests that a phonetic information structure is at play, where for decod...

متن کامل

Long short-term memory networks for noise robust speech recognition

In this paper we introduce a novel hybrid model architecture for speech recognition and investigate its noise robustness on the Aurora 2 database. Our model is composed of a bidirectional Long Short-Term Memory (BLSTM) recurrent neural net exploiting long-range context information for phoneme prediction and a Dynamic Bayesian Network (DBN) for decoding. The DBN is able to learn pronunciation va...

متن کامل

Impact of Vocal Tract Length Normalization on the Speech Recognition Performance of an English Vowel Phoneme Recognizer for the Recognition of Children Voices

Differences in human vocal tract lengths can cause inter speaker acoustic variability in speech signals spoken by different speakers for the same textual version and due to these variations, the robustness of a speaker independent (SI) speech recognition system is affected. Speaker normalization using vocal tract length normalization (VTLN) is an effective approach to reduce the affect of these...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007